Abstract

Let $Y = (y_1, y_2, \dots)$, $y_1 \ge y_2 \ge \cdots$, be the list of sizes of the
cycles in the composition of $cn$ transpositions on the set $\{1, 2, \dots, n\}$.
We prove that if $c > 1/2$ is constant and $n \to \infty$, the distribution
of $f(c)\,Y/n$ converges to PD(1), the Poisson-Dirichlet distribution
with parameter 1, where the function $f$ is known explicitly. A new
proof is presented of the theorem by Diaconis, Mayer-Wolf, Zeitouni
and Zerner stating that the PD(1) measure is the unique invariant
measure for the uniform coagulation-fragmentation process.

1 Introduction 

Consider the composition $\pi_t = T_t \circ T_{t-1} \circ \cdots \circ T_2 \circ T_1$ of random, uniform,
independent transpositions $T_j$ of $V := \{1, 2, \dots, n\}$. How large must $t$ be in
order for $\pi_t$ to "look like" a uniformly random permutation $\pi$ of $V$? As we
will see, the answer depends on the precise meaning given to the term "look
like".

It is easy to check that $P[\pi(v) = v] = 1/n$ for all $v \in V$. Therefore, the
expected number of fixed points of $\pi$ is 1. However, if $v$ does not appear
in any of the transpositions $T_1, T_2, \dots, T_t$, then $\pi_t(v) = v$. By the familiar
solution of the coupon collector's problem, we see that when $t = o(n \log n)$,
the probability that $\pi_t$ has at most one fixed point is small. In this sense,
$\pi_t$ and $\pi$ are rather different when $t = o(n \log n)$. On the other hand, when
$t > c\,n \log n$, $c > 1/2$, the total variation distance between the law of $\pi_t$ and
that of $\pi$ tends to zero as $n \to \infty$ [DS81].

We now consider the situation where $t < cn$ with $c < 1/2$. Let $G^t$ be the
graph on $V = \{1, 2, \dots, n\}$ where $\{v, u\}$ is an edge in $G^t$ if and only if the
transposition $(v, u)$ appears in $\{T_1, \dots, T_t\}$. Let $V_G^t$ denote the set of vertices
of the largest connected component of $G^t$ (with arbitrary tie breaking if there
is more than one). By the Erdős-Rényi Theorem, when $t < cn$, $c < 1/2$, we
have $|V_G^t| = O(\log n)$ asymptotically almost surely (a.a.s.). It follows that
the largest cycle (orbit) of $\pi_t$ is also of size $O(\log n)$. When $c = 1/2$, the same
holds, but with $\log n$ replaced by any function growing faster than $n^{2/3}$. This
contrasts with the fact that for every $k \in \{1, 2, \dots, n\}$ the probability that
the cycle of $\pi$ containing 1 has size $\le k$ is precisely $k/n$. Thus, $\pi_t$ is very
different from $\pi$ when $t/n < c < 1/2$.

Our main theorem deals with the case where $t/n > c > 1/2$. Confirming
a conjecture by Aldous, we prove that in this range, the large cycles of $\pi_t$,
when normalized by their total length, have a distribution that is close to
that of the large cycles of $\pi$. A more precise statement of this result will be
given shortly.

Let $\sigma$ be some permutation of $V$. Let $X(\sigma)$ denote the set of cycles
(orbits) of elements of $V$ under $\sigma$. The cycle structure $\mathfrak{X}(\sigma)$ is then the
sorted list of the lengths of the cycles, that is, the list $(|C| : C \in X(\sigma))$ sorted
in nonincreasing order. Thus, $\mathfrak{X}_i(\sigma)$ denotes the size of the $i$-th largest cycle
of $\sigma$. If $i$ is larger than the number of cycles of $\sigma$, then we set $\mathfrak{X}_i(\sigma) = 0$, by
convention. Since each $T_j$ is chosen uniformly among the transpositions, it
follows that for each fixed permutation $\sigma$ of $V$ the distribution of $\pi_t$ is the
same as that of $\sigma \circ \pi_t \circ \sigma^{-1}$. Thus, the distribution of $\pi_t$ is determined by
the distribution of the conjugacy class of $\pi_t$. Now, the conjugacy class of $\pi_t$
is determined by $\mathfrak{X}(\pi_t)$. Consequently, the distribution of $\mathfrak{X}(\pi_t)$ determines
the distribution of $\pi_t$.
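
The objects just defined can be explored numerically. The following is a minimal Python sketch, not taken from the paper: it composes $t$ uniform random transpositions and returns the sorted list of cycle lengths of the result. The function names `compose_random_transpositions` and `cycle_structure` are hypothetical, and the vertices are labelled $0, \dots, n-1$ rather than $1, \dots, n$.

```python
import random


def compose_random_transpositions(n, t, rng):
    """Return pi_t = T_t o ... o T_1 as a list with perm[v] = pi_t(v).

    Each T_j is a uniform random transposition of {0, ..., n-1}.
    """
    perm = list(range(n))   # perm[v] = image of v
    inv = list(range(n))    # inv[w] = preimage of w
    for _ in range(t):
        a, b = rng.sample(range(n), 2)   # uniform pair with a != b
        # Left-multiplying by (a b) swaps the two image values a and b.
        va, vb = inv[a], inv[b]
        perm[va], perm[vb] = b, a
        inv[a], inv[b] = vb, va
    return perm


def cycle_structure(perm):
    """Cycle lengths of perm, sorted in nonincreasing order."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, v = 0, start
        while not seen[v]:
            seen[v] = True
            v = perm[v]
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)
```

Calling `cycle_structure(compose_random_transpositions(n, t, random.Random(0)))` for $t$ slightly below and slightly above $n/2$ gives an informal picture of the transition discussed above.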

We are now ready to state our main theorem, which gives a positive
answer to a conjecture by David Aldous as stated in [BD].

Theorem 1.1. Let $c > 1/2$, and take $t > cn$. As $n \to \infty$, the law of
$\mathfrak{X}(\pi_t)/|V_G^t|$ converges weakly to the Poisson-Dirichlet distribution PD(1) with
parameter 1 (which is defined below).

A more explicit statement of the theorem is as follows. Given $c > 1/2$
and $\epsilon > 0$, there is an $n(c, \epsilon)$ such that for every $n > n(c, \epsilon)$ and every $t > cn$
there is a coupling of the sequence of transpositions $T_j$ and a PD(1) sample
$Y$ such that

$$P\Big[\,\big\|Y - \mathfrak{X}(\pi_t)/|V_G^t|\big\|_\infty < \epsilon\,\Big] > 1 - \epsilon .$$

Weak convergence has several equivalent formulations (see [Dud89]), and
we have opted to use the coupling version here.

The PD(1) distribution is a probability measure on the infinite dimensional
simplex

$$\Omega := \Big\{\, y \in [0,1]^{\mathbb{N}_+} : \sum_i y_i = 1,\ y_1 \ge y_2 \ge \cdots \Big\}$$

and may be defined as follows. Let $U_1, U_2, \dots$ be an i.i.d. sequence of random
variables uniformly distributed in $[0, 1]$. Set $x_1 := U_1$ and inductively
$x_j := U_j \big(1 - \sum_{i=1}^{j-1} x_i\big)$. Let $(y_i)$ be the sequence $(x_i)$ sorted in nonincreasing order.
The PD(1) distribution is defined as the law of $(y_i)$. See, for example, [Hol01]
for other definitions and a discussion of some of the properties of the
Poisson-Dirichlet distributions.
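
The stick-breaking definition above translates directly into a sampler. The sketch below is illustrative only and makes one assumption that is not in the text: the infinite sequence is truncated once the remaining mass falls below a tolerance `tol`, so the output is an approximation of a PD(1) sample. The name `sample_pd1` is hypothetical.

```python
import random


def sample_pd1(rng, tol=1e-12, max_terms=100_000):
    """Approximate PD(1) sample by stick-breaking, truncated at mass tol."""
    remaining = 1.0          # 1 - (sum of the x_i produced so far)
    xs = []
    while remaining > tol and len(xs) < max_terms:
        x = rng.random() * remaining   # x_j = U_j * (1 - sum of earlier x_i)
        xs.append(x)
        remaining -= x
    # Sorting in nonincreasing order turns (x_i) into (y_i).
    return sorted(xs, reverse=True)
```

Because the remaining mass shrinks by an independent uniform factor at each step, only a few dozen terms are typically produced before the tolerance is reached.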

The behaviour of the size of the largest cluster of $G^t$, which is the
normalizing quantity $|V_G^t|$ in the theorem, is known precisely. The Erdős-Rényi
theorem (see, e.g., [AS00]) tells us that

$$|V_G^t|/n \to z(2t/n) \tag{1.1}$$

in probability as $n \to \infty$, where $z(s)$ is the survival probability of a
Galton-Watson branching process whose offspring distribution is Poisson
with mean $s$. Moreover, $z(s)$ is the positive solution of the
equation $1 - z = \exp(-s z)$ when $s > 1$, and $z(s) = 0$ for $s \in [0, 1]$.
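
The function $z(s)$ has no closed form, but it is easy to evaluate numerically. The following sketch uses fixed-point iteration started at $z = 1$, which decreases monotonically to the largest solution of $1 - z = \exp(-sz)$ in $[0,1]$; this choice of method and the name `survival_probability` are assumptions made for illustration. Convergence is slow when $s$ is close to 1.

```python
import math


def survival_probability(s, tol=1e-14, max_iter=1_000_000):
    """Largest solution z in [0, 1] of 1 - z = exp(-s z)."""
    if s <= 1.0:
        return 0.0           # no positive solution in the subcritical case
    z = 1.0
    for _ in range(max_iter):
        z_next = -math.expm1(-s * z)   # equals 1 - exp(-s z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z
```

For example, with $t = n$ transpositions one has $s = 2t/n = 2$, and `survival_probability(2.0)` is roughly 0.797, so the largest cluster contains about 80% of the vertices.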

Berestycki and Durrett [BD] have analysed other aspects of the chain $\pi_t$
which exhibit a phase transition near $t = n/2$: they investigate the minimal
number of transpositions necessary to write $\pi_t$ as a composition.

In [DMP95], Diaconis, McGrath and Pitman discuss the riffle shuffle,
which is another example where the large cycles appear relaxed well before
the permutation is uniformly distributed.

The evolution of $\mathfrak{X}(\pi_t)$ is also known as the discrete uniform
coagulation-fragmentation process. Let us briefly describe the transition from $X(\pi_t)$ to
$X(\pi_{t+1})$. Suppose that $T_{t+1}$ is the transposition $(a, b)$. Then $a$ and $b$ are
selected uniformly from $V$, and are "almost independent". (We could also
allow $a = b$; then $T = (a, a)$ would be the identity transposition, and $a$ and
$b$ would be independent. That would not change anything significant in the
following.) Let $X_i, X_j \in X(\pi_t)$ satisfy $a \in X_i$, $b \in X_j$. Then $X_i$ and $X_j$ are
size biased selections from $X(\pi_t)$, and are nearly independent given $\pi_t$. If
$X_i \ne X_j$, then in $\pi_{t+1}$ the two cycles $X_i$ and $X_j$ are replaced by the single
cycle whose vertices are $X_i \cup X_j$. If $X_i = X_j$, then this cycle splits into two
cycles of $\pi_{t+1}$. If $k = |X_i|$ and $m \in \mathbb{N}_+$ is the least positive integer satisfying
$\pi_t^m(a) = b$, then the resulting two cycles of $\pi_{t+1}$ are $(a, \pi_t(a), \dots, \pi_t^{m-1}(a))$
and $(b, \pi_t(b), \dots, \pi_t^{k-m-1}(b))$. Note that given $X_i$ and given $X_i = X_j$, the
resulting two new cycles have sizes $m$ and $k - m$, where $m$ is chosen
uniformly in $\{1, 2, \dots, |X_i| - 1\}$.

There is a similar continuous coagulation-fragmentation process, which is
a discrete time Markov chain on the infinite dimensional simplex $\Omega$. The
transition kernel $M$ of the chain operates as follows. Given $Y = (Y_1, Y_2, \dots) \in \Omega$,
we choose two indices $i, j \in \mathbb{N}_+$ independently, with
$P[i = k \mid Y] = P[j = k \mid Y] = Y_k$. If $i \ne j$, then let $Y'$ be obtained from $Y$ by replacing the
two entries $Y_i$ and $Y_j$ with the single entry $Y_i + Y_j$ and resorting. If $i = j$,
then given $(Y, i, j)$, a random variable $v$ is selected uniformly in $[0, Y_i]$ and
$Y'$ is obtained by splitting the entry $Y_i$ into the two entries $v$ and $Y_i - v$, and
resorting. Then $Y'$ is the new state of the Markov chain.
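
One step of the kernel $M$ can be sketched in a few lines of Python. The sketch assumes, for illustration only, that the state is stored as a finite list of sizes summing to 1 (a true PD(1) sample has infinitely many positive entries), and it takes the three uniform random numbers as explicit arguments so that the step is deterministic given its inputs. The names `pick_index`, `coag_frag_step` and `run_chain` are hypothetical.

```python
import bisect
import itertools
import random


def pick_index(sizes, u):
    """Index k chosen with probability proportional to sizes[k], driven by u in [0, 1)."""
    cumulative = list(itertools.accumulate(sizes))
    k = bisect.bisect_right(cumulative, u * cumulative[-1])
    return min(k, len(sizes) - 1)


def coag_frag_step(y, u1, u2, w):
    """One step of the kernel M on a finite list y, using uniforms u1, u2, w."""
    i, j = pick_index(y, u1), pick_index(y, u2)   # two size-biased picks
    rest = [size for k, size in enumerate(y) if k not in (i, j)]
    if i != j:
        rest.append(y[i] + y[j])        # coagulation: merge Y_i and Y_j
    else:
        v = w * y[i]                    # v is uniform in [0, Y_i]
        rest.extend([v, y[i] - v])      # fragmentation: split Y_i
    return sorted(rest, reverse=True)


def run_chain(y, steps, rng):
    """Iterate coag_frag_step with fresh uniforms drawn from rng."""
    for _ in range(steps):
        y = coag_frag_step(y, rng.random(), rng.random(), rng.random())
    return y
```

For instance, `coag_frag_step([0.5, 0.25, 0.25], 0.1, 0.6, 0.0)` merges the first two entries, while `coag_frag_step([0.5, 0.25, 0.25], 0.1, 0.2, 0.5)` picks the first entry twice and splits it in half.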

It is known that the probability measure PD(1) is invariant under $M$.
Apparently, this was first proved in [Wat76]; references for several other
proofs of this fact are given in [DMWZZ]. Vershik conjectured that PD(1)
is the only invariant measure. Subsequently, this was proved by Diaconis,
Mayer-Wolf, Zeitouni and Zerner:

Theorem 1.2 ([DMWZZ]). The invariant measure for $M$ is unique.

See [DMWZZ] for more information and bibliography regarding the
history of the problem, including some earlier established special cases.

The proof of [DMWZZ] relies on coupling the discrete and the continuous
coagulation-fragmentation processes, and using representation theory on the
symmetric group to understand the discrete process. In the present paper, we
use a different coupling to handle the continuous process directly, and thereby
give a different proof of Theorem 1.2. Moreover, a slight modification of this
coupling will be essential in the proof of Theorem 1.1.

The problems addressed in this paper are a mean-field version of a
statistical physics model suggested by Tóth [Tót93], which may be described
as follows. Consider a locally finite graph $G = (V, E)$, and fix a parameter
$\beta > 0$. For each (unoriented) edge $e \in E$, let $Z_e \subset [0, 1]$ be an independent
Poisson point process of intensity $\beta$ on $[0, 1]$. Let $v_0 \in V$. We now describe
a walk $v(t)$ starting at $v(0) = v_0$. Let $t_1$ be the first $t > 0$ such that there
is an edge $e_1 = [v_0, v_1]$ incident with $v_0$ such that $t_1 \in Z_{e_1} + \mathbb{Z}$. If there is
no such $t_1$, then $v(t) = v_0$ for all $t \ge 0$. But if $t_1$ exists, then let $v(t) = v_0$
for $t \in [0, t_1)$ and $v(t_1) = v_1$. Inductively, assume that $t_j$ and $v_j$ are
defined and $v(t_j) = v_j$. Let $t_{j+1}$ be the first $t > t_j$ such that there is an edge
$e_{j+1} = [v_j, v_{j+1}]$ incident with $v_j$ such that $t \in Z_{e_{j+1}} + \mathbb{Z}$; set $v(t) = v_j$ for
$t \in (t_j, t_{j+1})$ and $v(t_{j+1}) = v_{j+1}$.

In the case where $G$ is the complete graph on $V$, it is easy to see that the
orbit of 1 in $\pi_t$ is analogous to the range of this walk starting at 1, where
$\beta = t/n$. The essential difference between the two is the distinction between
continuous time and discrete time.

There are several known open problems regarding Tóth's model. Is it true
that for (connected) bounded degree graphs $G$, the simple random walk on
$G$ is transient iff Tóth's walk $v_j$ visits infinitely many vertices with positive
probability for some $\beta > 0$? In particular, is this true for $G = \mathbb{Z}^d$? For finite
graphs $G$, one may ask about the distribution of the size of the image of
the walk $\{v_j\}$, for example. See [Ang03] for an analysis of Tóth's model on
regular trees and for a list of some open problems, including those mentioned
above.

Returning to the symmetric group, one may ask about the typical cycle
structure near the transition point $t = n/2$. A very thorough analogous
theory exists for the Erdős-Rényi transition. See, for example, [Spe94], [XS00],
[JLR00] and the references cited there.
Notations

For the convenience of the reader, we list here some of the notations used
extensively, with a brief description where appropriate.

$V$ : $\{1, 2, \dots, n\}$
$T_1, T_2, \dots$ : i.i.d. uniform transpositions on $V$
$\pi_t$ : $T_t \circ T_{t-1} \circ \cdots \circ T_1$
$X(\sigma)$ : set of cycles of a permutation $\sigma$
$X^s$ : $X(\pi_s)$
$\mathfrak{X}(\sigma)$ : cycle structure of $\sigma$
$X^s(v)$ : cycle of $\pi_s$ containing $v$
$V_X^s(k)$ : union of the cycles of $\pi_s$ of size at least $k$
$G^t$ : graph whose edges correspond to transpositions $T_i$, $i \le t$
$V_G^t$ : largest cluster in $G^t$
$V_G^t(k)$ : union of clusters of $G^t$ of size at least $k$
$z(s)$ : function in the Erdős-Rényi theorem
$\Omega$ : $\{y \in [0,1]^{\mathbb{N}_+} : \sum_i y_i = 1,\ y_1 \ge y_2 \ge \cdots\}$
PD(1) : Poisson-Dirichlet distribution with parameter 1
$M$ : coagulation-fragmentation transition kernel
$\hat{M}$ : the coupling
$I(Y, Z)$ : indexes of matched entries
$Q$ : sum of matched entries
$\mathcal{Y}, \mathcal{Z}, \tilde{\mathcal{Y}}, \tilde{\mathcal{Z}}$ : partitions used in defining $\hat{M}$
$u, v$ : random variables used in the definition of $\hat{M}$
$\bar\epsilon$ : $\epsilon$ + fragments smaller than $\epsilon$
$N^t$ : unmatched entries larger than $\epsilon$
$y_1^t, z_1^t$ : largest unmatched entries in $Y^t$ and $Z^t$
2 Big pieces

The main goal of the present section is to show in a quantitative way that
most vertices in $V_G^t$ are in reasonably large cycles of $\pi_t$.

Suppose that $\pi$ is a permutation on $V$ and $T = (x, y)$ a transposition. If $x$
and $y$ are in different cycles in $\pi$, then in $T \circ \pi$ these two cycles are joined, and
the other cycles remain unchanged. Now suppose that $C = (x_0, x_1, \dots, x_m)$
is a cycle of $\pi$ which contains $x$ and $y$. Say, $x = x_j$, $y = x_i$, and $i < j$.
Then in $T \circ \pi$ the cycle $C$ is split into the cycles $(x_i, x_{i+1}, \dots, x_{j-1})$ and
$(x_j, x_{j+1}, \dots, x_m, x_0, x_1, \dots, x_{i-1})$. The other cycles remain unchanged, of
course. This clearly implies the following

Lemma 2.1. Let $\pi$ be a permutation of $V$ and $s \in \mathbb{N}$. Let $T$ be a uniform
random transposition on $V$. Then the probability that some cycle of $\pi$ is
split in $T \circ \pi$ into two cycles at least one of which has length $\le s$ is at most
$2s/(n-1)$. □
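
The merge and split rule preceding Lemma 2.1, and the bound of the lemma itself, can be checked by brute force on small examples. The sketch below is illustrative only: permutations act on $\{0, \dots, n-1\}$, the helper names are hypothetical, and `short_split_probability` simply enumerates all $\binom{n}{2}$ transpositions.

```python
from itertools import combinations


def cycles_of(perm):
    """Return the cycles of perm (perm[v] = image of v) as lists."""
    seen, cycles = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, v = [], start
        while v not in seen:
            seen.add(v)
            cycle.append(v)
            v = perm[v]
        cycles.append(cycle)
    return cycles


def apply_transposition(perm, x, y):
    """Return T o perm, where T is the transposition (x y)."""
    swap = {x: y, y: x}
    return [swap.get(w, w) for w in perm]


def short_split_probability(perm, s):
    """Probability that a uniform transposition splits off a cycle of length <= s."""
    n = len(perm)
    before = {frozenset(c) for c in cycles_of(perm)}
    hits = 0
    for x, y in combinations(range(n), 2):
        after = cycles_of(apply_transposition(perm, x, y))
        new = [c for c in after if frozenset(c) not in before]
        # A split increases the number of cycles by one; `new` holds the two pieces.
        if len(after) > len(before) and min(len(c) for c in new) <= s:
            hits += 1
    return hits / (n * (n - 1) / 2)
```

For the single cycle $(0, 1, 2, 3, 4, 5)$ and the transposition $(4\ 1)$, the rule gives the two cycles $(1, 2, 3)$ and $(4, 5, 0)$. In this example the bound $2s/(n-1)$ is attained for $s = 1$ and $s = 2$.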

This will be used in the next lemma. Let $X^s = X(\pi_s)$ be the set of
cycles of $\pi_s$ and for $v \in V$ let $X^s(v)$ be the cycle in $X^s$ containing $v$. Let
$V_G^s(k) \subset V$ be the union of those connected components of $G^s$ which have
at least $k$ vertices, and let $V_X^s(k) \subset V$ be the union of the cycles in $X^s$ that
have at least $k$ vertices.

Lemma 2.2.

$$E\,\big|V_G^s(k) \setminus V_X^s(k)\big| \le 4 s k^2/(n-1)$$

holds for every $k, s \in \mathbb{N}$.

Proof. Let $I$ be the set of $t \in \mathbb{N}$ such that there is a cycle $A \in X^{t-1}$ which
splits into two nonempty cycles in $X^t$, $A = A_1 \cup A_2$, $A_1, A_2 \in X^t$, and at
least one of these cycles, say $A_1$, satisfies $|A_1| < k$. The above lemma shows
that $P[t \in I] \le 2k/(n-1)$ for every $t \in \mathbb{N}$, and hence
$E\big[|I \cap [0, s]|\big] \le 2sk/(n-1)$.

Suppose that $C \in X^s$, $|C| < k$ and $C \subset V_G^s(k)$. There must be some
vertex $u \in C$ and some time $t \le s$ such that $|X^t(u)| < |X^{t-1}(u)|$; otherwise,
$C$ would be equal to a component of $G^s$. Among all such possible pairs $(u, t)$,
we choose one that maximizes $t$. Then we have $X^t(u) \subset C$. Consequently,
$t \in I \cap [0, s]$ and at least one of the two elements of $V$ transposed by $T_t$ is in
$C$. Therefore, the number of such $C$ is at most $2\,|I \cap [0, s]|$. Since each such
$C$ has fewer than $k$ vertices, the statement
of the lemma now follows from the above bound on $E\,|I \cap [0, s]|$. □
The following lemma will tell us that if $X^t$ has many vertices in reasonably
large cycles at time $t = t_0$, then with high probability at a specified later time
$t_1$ most of these vertices will be in cycles of size at least $\epsilon\delta n$.

Lemma 2.3. Let $\delta \in (0, 1]$, $t_0, j \in \mathbb{N}$, and $\epsilon \in (0, 1/8)$. (The lemma will
be useful primarily when $(\log n)^2 < 2^j < n^a$ with any constant $a < 1/2$.)
Assume that $2^j < \epsilon\delta n$ and that $P\big[|V_X^{t_0}(2^j)| \ge \delta n\big] > 0$. Set $\rho := 2^j/n$ and

$$t_1 := t_0 + \big\lceil 2^6\, \delta^{-1} \rho^{-1} \log_2(\rho^{-1}) \big\rceil . \tag{2.1}$$

Then the number of vertices $v$ that are in cycles of size at least $2^j$ at time $t_0$
but are not in cycles of size at least $\epsilon\delta n$ at time $t_1$ satisfies

$$E\Big[\,\big|V_X^{t_0}(2^j) \setminus V_X^{t_1}(\epsilon\delta n)\big| \;\Big|\; |V_X^{t_0}(2^j)| \ge \delta n \Big] \le O(1)\, \delta^{-1} \epsilon\, |\log(\epsilon\delta)|\, n , \tag{2.2}$$

where the constant implied in the $O(1)$ notation is universal.

Two important aspects of this lemma are that the right hand side of (2.2)
does not depend on $j$ and that $t_1$ does not depend on $\epsilon$. (However, $t_1 - t_0$
depends primarily on $j$ and the right hand side of (2.2) depends primarily
on $\epsilon$.)

Before we begin with the actual proof, here is an informal outline. Let
$v \in V_X^{t_0}(2^j)$. Set $K := \lceil \log_2(\epsilon\delta n) \rceil$. We will choose a sequence of times
$\tau_j, \tau_{j+1}, \dots, \tau_K$. For $s = j, j+1, \dots, K$, when $t \in [\tau_s, \tau_{s+1})$ we will "expect"
the size of $X^t(v)$ to be at least $2^s$. This can fail in either of two scenarios:
it may happen because a transposition cuts the cycle of $v$, or it may
happen because no transposition merges the cycle of $v$ with a sufficiently large
cycle. The probabilities for each of these unfortunate situations will be
appropriately estimated. The choice of the time interval $\tau_{s+1} - \tau_s$ is somewhat
delicate. If it is too long, then perhaps too many cycles will be cut, while if
it is too short, then cycles will not have enough time to merge. It turns out
that

$$a_s := 2^4\, \delta^{-1}\, 2^{-s} \big(\lceil \log_2 n \rceil - s\big) (n-1)$$

is roughly the right choice, as will become clear in the course of the proof.

Proof. Within the proof below, expectations and probabilities will be
conditioned on $|V_X^{t_0}(2^j)| \ge \delta n$. Let $K := \lceil \log_2(\epsilon\delta n) \rceil$, and let $a_s$ be as above. For
$s > j$ let $m_s := \lceil a_s \rceil$ and set $m_j := t_1 - t_0 - \sum_{s=j+1}^{K-1} m_s$. Set $\tau_s := t_0 + \sum_{i=j}^{s-1} m_i$.
Note that $\tau_K = t_1$ and $a_s \le m_s \le O(a_s)$ for $s = j, j+1, \dots, K-1$.

Let $s \in \{j, j+1, \dots, K-1\}$ and $t \in \{\tau_s + 1, \tau_s + 2, \dots, \tau_{s+1}\}$. Define
$F^t \subset V$ to be the set of vertices $v \in V$ such that $|X^t(v)| < |X^{t-1}(v)|$ and
$|X^t(v)| < 2^{s+1}$. Lemma 2.1 shows that $E|F^t| \le 2^{2s+4}/(n-1)$. We also set
$\bar{F}^t := \bigcup_{t'=t_0+1}^{t} F^{t'}$. Then

$$E\,|\bar{F}^{t_1}| \le \sum_{s=j}^{K-1} m_s\, 2^{2s+4}/(n-1) = O\big(\epsilon\, |\log(\epsilon\delta)|\, n\big) .$$

We consider the vertices in $F^t$ as vertices "failing" at time $t$. However,
there are other ways in which vertices can fail. If at time $t \in \{\tau_s, \tau_s + 1, \dots, \tau_{s+1} - 1\}$
we have $|V_X^t(2^s)| < \delta n/2$, then we consider the whole process
as failed, and we set $H^t := V$. Otherwise, take $H^t = \emptyset$. Also set
$\bar{H}^t := \bigcup_{t'=t_0}^{t} H^{t'}$.

The third and last way in which a vertex $v$ may fail is if $X^t(v)$ does not
grow in time. Let

$$B_s := V_X^{\tau_s}(2^s) \setminus \big(\bar{F}^{\tau_{s+1}} \cup \bar{H}^{\tau_{s+1}-1} \cup V_X^{\tau_{s+1}}(2^{s+1})\big) ,$$

and $\bar{B}_s := \bigcup_{k=j}^{s} B_k$. The vertices in $B_s$ are vertices whose cycles failed to
grow sufficiently between time $\tau_s$ and time $\tau_{s+1}$. It is clear that

$$V_X^{t_0}(2^j) \subset V_X^{t_1}(\epsilon\delta n) \cup \bar{H}^{t_1} \cup \bar{F}^{t_1} \cup \bar{B}_K . \tag{2.3}$$

If $v \in B_s$, then it must be the case that for every $t \in [\tau_s, \tau_{s+1} - 1]$ we have
$|V_X^t(2^s)| \ge \delta n/2$ (since $B_s$ is disjoint from $\bar{H}^{\tau_{s+1}-1}$) and $v \in V_X^t(2^s) \setminus V_X^t(2^{s+1})$
(since $B_s$ is disjoint from $\bar{F}^{\tau_{s+1}}$). If we condition on $2^s \le |X^t(v)| < 2^{s+1}$ and
on $|V_X^t(2^s)| \ge \delta n/2$, then there is probability at least

$$2^s \big(\delta n/2 - 2^{s+1}\big) \binom{n}{2}^{-1} \ge 2^{s-3}\, \delta\, (n-1)^{-1}$$

that $T_{t+1}$ transposes an element from $X^t(v)$ and an element from some other
cycle of $X^t$ whose size is at least $2^s$. If that happens, then $v \in V_X^{t+1}(2^{s+1})$ and
this implies that $v$ cannot be in $B_s$. Consequently,

$$P[v \in B_s] \le \big(1 - 2^{s-3}\delta/(n-1)\big)^{m_s} \le \exp\big(-2^{s-3}\, \delta\, m_s/(n-1)\big) \le O(2^s/n) .$$

Hence,

$$E\,|\bar{B}_K| \le O(1)\, n\, 2^K n^{-1} = O(\epsilon\delta n) .$$

It follows from the definition of $H^t$ that in order for $H^t$ to be nonempty,
we must have $|\bar{F}^t \cup \bar{B}_{s-1}| > \delta n/2$. Therefore,

$$E\,|\bar{H}^{t_1}| \le n\, P\big[|\bar{F}^{t_1} \cup \bar{B}_K| > \delta n/2\big] \le 2\, \delta^{-1} E\,|\bar{F}^{t_1} \cup \bar{B}_K| .$$

When we combine this with (2.3) and the above estimates for $E|\bar{F}^{t_1}|$ and
$E|\bar{B}_K|$, the lemma follows. □

Lemma 2.4. Fix some $c > 1/2$, and let $t > cn$, $t \in \mathbb{N}$. Let $\epsilon, \alpha \in (0, 1/8)$
and let $N$ be the minimal number of cycles in $X^t$ which cover at least
$(1 - \epsilon)\,|V_G^t|$ vertices of $V_G^t$. Then

$$P\big[N > \alpha^{-1}\, |\log(\alpha\epsilon)|^2\big] \le C_1\, \alpha$$

for all $n > n_1$, where $C_1$ is a constant which depends only on $c$, and $n_1$ may
depend on $c$ and $\epsilon$.

Proof. First, suppose that $t < n^{5/4}$. Choose $j$ such that $n^{1/4} \le 2^j < 2n^{1/4}$.
Let $\delta = z/2$, where $z = z(2t/n)$ is the Galton-Watson survival probability
discussed in the introduction. Choose $t_0$ so that (2.1) holds with $t$ in place
of $t_1$. Note that $t - t_0 = O(n^{3/4} \log n)$. (Here and below, the constants in the
$O(\cdot)$ notation may depend on $c$.) We apply the Erdős-Rényi theorem at time
$t_0$ to conclude that a.a.s. $\big||V_G^{t_0}| - nz\big| = o(n)$ and the second largest component
of $G^{t_0}$ has size less than $(\log n)^2$. Lemma 2.2 with $k = 2^j$ and $s = t_0$ implies
that $|V_G^{t_0} \setminus V_X^{t_0}(2^j)| < n^{7/8}$ a.a.s. Note also that $|V_G^t \setminus V_G^{t_0}| \le (t - t_0)\, O(\log n)^2$
a.a.s., because we know that the components of $G^{t_0}$ other than the largest
one are typically smaller than $(\log n)^2$. Hence $|V_G^t \setminus V_X^{t_0}(2^j)| < n^{7/8}$ a.a.s.
Now, Lemma 2.3 implies that for every fixed $\epsilon' > 0$ and for every sufficiently
large $n$

$$E\,\big|V_G^t \setminus V_X^t(\epsilon' n)\big| \le O(1)\, \epsilon'\, |\log \epsilon'|\, n . \tag{2.4}$$

(Note that $|V_G^t| \le n$, and hence the conditioning in (2.2) may be ignored
once $n$ is large enough so that $P\big[|V_X^{t_0}(2^j)| < \delta n\big] < \epsilon'\, |\log \epsilon'|$.)

Now, to show that (2.4) holds also without the assumption that $t < n^{5/4}$,
we note that Lemma 2.3 may be applied with $j = 0$, $\delta = 1$ and $t_0$ chosen so
that (2.1) holds with $t$ in place of $t_1$. (In this case, we do not need to use
Lemma 2.2.)



Set $a(k) := |V_G^t \setminus V_X^t(k)|$. Let $i_0$ be the smallest integer $i$ such that
$a(2^{-i} n) < \epsilon n/2$. Then $N$ is bounded by the number of cycles in $V_X^t(2^{-i_0} n) \cap V_G^t$.
Let $i_1$ be the least integer such that $2^{-i_1} < \alpha\epsilon/|\log(\alpha\epsilon)|$. Then (2.4)
shows that $P[i_0 > i_1] = O(\alpha)$. We may write

$$a(k) = \sum \big\{ |A| : A \subset V_G^t,\ A \in X^t,\ |A| < k \big\} .$$

By considering the contribution of each cycle to the sum

$$S_m := \sum_{i=0}^{m} a(2^{-i} n)\, 2^i/n$$

we find that $N = O(S_{i_0})$. On the other hand (2.4) implies that

$$E[S_{i_1}] \le O(1) \sum_{i=0}^{i_1} i \le O(1)\, (i_1)^2 \le O(1)\, |\log(\alpha\epsilon)|^2 .$$

Because $P[i_0 > i_1] = O(\alpha)$, this completes the proof. □

3 Coupling 

At this point, it seems likely that the proof of Theorem 1.1 can be completed
using some of the results from the work of Diaconis, Mayer-Wolf, Zeitouni
and Zerner [DMWZZ]. However, we prefer instead to use a different coupling
argument to finish off the proof and also prove the main result of [DMWZZ].

We now describe a coupling in the continuous setting. A similar coupling
will also apply to couple between the discrete and continuous setting, but the
purely continuous setting avoids several annoying minor notational issues.

The coupling is between two Markov chains $Y^t$ and $Z^t$ starting at possibly
different initial points $Y^0, Z^0 \in \Omega$, with each separately evolving
according to the transition kernel $M$.

In this coupling, the evolution of $(Y^t, Z^t)$ will also be Markov. Its
transition kernel will be denoted by $\hat{M}$.

The basic idea in the construction of $\hat{M}$ is that if we have entries in $Y^t$
that are equal to entries in $Z^t$, then we don't want to ruin this. Consequently,
if we make a change to such an entry in $Y^t$, we want to make a corresponding
change to the corresponding entry in $Z^t$. On the other hand, as much as we
can, we do want to produce new entries in $Y^t$ and $Z^t$ that match. Our
measure of the discrepancy between $Y^t$ and $Z^t$ will roughly be the number
of large unmatched entries, and we will strive to reduce the discrepancy.

In order to define $\hat{M}$, we need some more notations. Let $(Y, Z) \in \Omega^2$. We
will need to match entries in $Y$ with entries in $Z$ of the same length, if such
exist, and match as many entries as possible. The matching will be encoded
via maps $f_{Z,Y}, f_{Y,Z} : \mathbb{N}_+ \to \mathbb{N}$, which are defined as follows. Let $i \in \mathbb{N}_+$, let
$H$ be the set of $j \in \mathbb{N}_+$ such that $Y_i = Z_j$, and let $k := |\{j \in \mathbb{N}_+ : j \le i,\ Y_j = Y_i\}|$.
(Partly because we want to easily generalize to the discrete setting,
we do not want to rule out the possibility that $Y_i = Y_j$ for some $i \ne j$.) If
$|H| < k$, then set $f_{Y,Z}(i) = 0$. Otherwise, let $f_{Y,Z}(i)$ be the $k$'th smallest
element in $H$. (By exchanging $Y$ and $Z$, this also defines the map $f_{Z,Y}$.) Let

$$I(Y, Z) := f_{Y,Z}^{-1}(\mathbb{N}_+) = \{ i \in \mathbb{N}_+ : f_{Y,Z}(i) \ne 0 \} .$$

The entries $Y_i$ with $i \in I(Y, Z)$ will be referred to as matched. Likewise, $Z_j$,
$j \in I(Z, Y)$, are the matched entries of $Z$. Observe that $f_{Z,Y} \circ f_{Y,Z}(i) = i$
for every $i \in I(Y, Z)$, $f_{Y,Z}(I(Y, Z)) = I(Z, Y)$ and $Z_{f_{Y,Z}(i)} = Y_i$ for every
$i \in I(Y, Z)$. Let

$$Q = Q(Y, Z) := \sum \{ Y_i : i \in I(Y, Z) \} = \sum \{ Z_j : j \in I(Z, Y) \} .$$
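
The matching maps are purely combinatorial and can be sketched directly from the definition. In the illustration below, $Y$ and $Z$ are assumed to be finite lists and equality of entries is tested with exact floating-point comparison, which is an assumption suited only to small examples with exactly representable values. Indices are 1-based, as in the text, with 0 meaning "unmatched". The function names are hypothetical.

```python
def matching_map(y, z):
    """f_{Y,Z} as a list: entry i-1 holds f(i), a 1-based index into z, or 0."""
    f = []
    for i in range(1, len(y) + 1):
        # H: indices j with Z_j = Y_i, in increasing order.
        h = [j for j in range(1, len(z) + 1) if z[j - 1] == y[i - 1]]
        # k: number of j <= i with Y_j = Y_i (this counts i itself).
        k = sum(1 for j in range(1, i + 1) if y[j - 1] == y[i - 1])
        f.append(h[k - 1] if len(h) >= k else 0)
    return f


def matched_indices(y, z):
    """I(Y, Z): the 1-based indices i with f_{Y,Z}(i) != 0."""
    return [i for i, j in enumerate(matching_map(y, z), start=1) if j != 0]


def matched_mass(y, z):
    """Q(Y, Z): the total size of the matched entries of y."""
    return sum(y[i - 1] for i in matched_indices(y, z))
```

With $Y = (0.5, 0.25, 0.25)$ and $Z = (0.5, 0.25, 0.125, 0.125)$, the first two entries of each list are matched, the second copy of 0.25 in $Y$ has no partner, and $Q = 0.75$.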

We will now describe the transition kernel $\hat{M}$. Given $(Y, Z) \in \Omega^2$, we
need to perform one step of $M$ for each of $Y$ and $Z$, thereby generating
new configurations $Y'$ and $Z'$. We associate with $Y$ and with $Z$ partitions
$\mathcal{Y} = (\mathcal{Y}_i : i \in \mathbb{N}_+)$ and $\mathcal{Z} = (\mathcal{Z}_i : i \in \mathbb{N}_+)$ of $[0, 1]$ into closed intervals,
as follows. (See also Figure 3.1.) The length of the interval $\mathcal{Y}_i$ is $Y_i$. The
intervals $\mathcal{Y}_i$ with $i \in I(Y, Z)$ tile the interval $[1 - Q, 1]$, while the intervals with
$i \notin I(Y, Z)$ tile the interval $[0, 1 - Q]$. Within each of these classes, let the
intervals be ordered according to the indices; that is, $\max \mathcal{Y}_i \le \min \mathcal{Y}_{i'}$ if $i < i'$
when $i, i' \in I(Y, Z)$ and when $i, i' \notin I(Y, Z)$. A partition $\mathcal{Z} = (\mathcal{Z}_j : j \in \mathbb{N}_+)$
is constructed in the same way. Note that necessarily $\mathcal{Z}_{f_{Y,Z}(i)} = \mathcal{Y}_i$ whenever
$i \in I(Y, Z)$.

Let $u$ and $v$ be two independent uniform random variables in $[0, 1]$. Let
$a, a' \in \mathbb{N}_+$ be the indices satisfying $u \in \mathcal{Y}_a$ and $u \in \mathcal{Z}_{a'}$. In this way, $u$
induces a size biased sample from $Y$ and from $Z$. We will use $v$ to induce
a different size biased sample, based on different tilings of $[0, 1]$. Let $\tilde{\mathcal{Y}}$ be
the tiling $(\tilde{\mathcal{Y}}_i : i \in \mathbb{N}_+)$ of $[0, 1]$ by intervals that is obtained from $\mathcal{Y}$ by
shifting the interval $\mathcal{Y}_a$ to the beginning. (That is, $\tilde{\mathcal{Y}}_a = [0, Y_a]$, $\tilde{\mathcal{Y}}_i = \mathcal{Y}_i + Y_a$ if
$\max \mathcal{Y}_i \le \min \mathcal{Y}_a$ and $\tilde{\mathcal{Y}}_i = \mathcal{Y}_i$ if $\max \mathcal{Y}_a \le \min \mathcal{Y}_i$.) Similarly, $\tilde{\mathcal{Z}}$ is the tiling
obtained from $\mathcal{Z}$ by shifting $\mathcal{Z}_{a'}$ to the beginning. Note that $\tilde{\mathcal{Z}}_{f_{Y,Z}(i)} = \tilde{\mathcal{Y}}_i$
whenever $i \in I(Y, Z)$.

Figure 3.1: The random variable $u$ chooses a segment in $\mathcal{Y}$ and a segment
in $\mathcal{Z}$. The different illustrated choices for the random variable $v$ yield a split
in $Y$ and in $Z$, a merge in $Y$ and a split in $Z$, a merge in both where a
matched segment is not involved, and merges involving matched segments,
respectively.

Let $b$ and $b'$ be the indices satisfying $v \in \tilde{\mathcal{Y}}_b$ and $v \in \tilde{\mathcal{Z}}_{b'}$. If $a \ne b$, let $Y'$
be obtained from $Y$ by replacing the two entries $Y_a$ and $Y_b$ by the single entry
$Y_a + Y_b$ and resorting. If $a = b$, let $Y'$ be obtained from $Y$ by replacing $Y_a$
with the two entries $v$ and $Y_a - v$ and resorting. Similarly, if $a' \ne b'$, let $Z'$ be
obtained from $Z$ by replacing the two entries $Z_{a'}$ and $Z_{b'}$ by the single entry
$Z_{a'} + Z_{b'}$ and resorting. If $a' = b'$, let $Z'$ be obtained from $Z$ by replacing
$Z_{a'}$ with the two entries $v$ and $Z_{a'} - v$ and resorting. This completes the
construction of the Markov transition kernel $\hat{M}$.
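
To make the construction concrete, here is a hedged Python sketch of one step of the coupling on finite lists. It is an illustration under stated assumptions rather than the paper's construction in full generality: states are finite lists sorted in nonincreasing order, matching uses exact floating-point equality (so it is reliable only for exactly representable values such as dyadic fractions), indices are 0-based, and all function names are hypothetical.

```python
def matching_map(y, z):
    """0-based matching: f[i] is the index in z matched to y[i], or None."""
    f = []
    for i, size in enumerate(y):
        h = [j for j, other in enumerate(z) if other == size]
        k = sum(1 for j in range(i + 1) if y[j] == size)
        f.append(h[k - 1] if len(h) >= k else None)
    return f


def tiling_order(y, z):
    """Unmatched indices of y (tiling [0, 1-Q]) followed by the matched ones."""
    f = matching_map(y, z)
    unmatched = [i for i in range(len(y)) if f[i] is None]
    matched = [i for i in range(len(y)) if f[i] is not None]
    return unmatched + matched


def locate(order, sizes, x):
    """Index whose interval contains x when intervals are laid out in `order`."""
    right = 0.0
    for i in order:
        right += sizes[i]
        if x < right:
            return i
    return order[-1]


def one_side(y, z, u, v):
    """Update y using the tilings built from the pair (y, z)."""
    order = tiling_order(y, z)
    a = locate(order, y, u)
    shifted = [a] + [i for i in order if i != a]   # move interval a to the front
    b = locate(shifted, y, v)
    rest = [size for i, size in enumerate(y) if i not in (a, b)]
    if a != b:
        rest.append(y[a] + y[b])       # merge Y_a and Y_b
    else:
        rest.extend([v, y[a] - v])     # split Y_a into v and Y_a - v
    return sorted(rest, reverse=True)


def coupled_step(y, z, u, v):
    """One step of the coupling: the same (u, v) drives both coordinates."""
    return one_side(y, z, u, v), one_side(z, y, u, v)
```

With $Y = (0.5, 0.25, 0.25)$ and $Z = (0.5, 0.25, 0.125, 0.125)$, the choice $u = 0.5$, $v = 0.25$ splits the matched entry 0.5 in both lists, while $u = 0.0625$, $v = 0.1875$ splits an unmatched entry of $Y$ and merges the two unmatched entries of $Z$.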

Let us observe a few essential features of this coupling. If $Y_i$ and $Z_j$ are
split, then one of the two new entries in each of $Y'$ and $Z'$ is equal to $v$.
If $i \in I(Y, Z)$ and $Y_i$ is split or merged, then the same happens to $Z_{f_{Y,Z}(i)}$.
Similarly, if $j \in I(Z, Y)$ and $Z_j$ is split or merged, then the same happens to
$Y_{f_{Z,Y}(j)}$.

We first informally describe the general behaviour of $\hat{M}$, postponing the
exact statements and proofs. When there are several unmatched reasonably
large entries in $Y^t$ and in $Z^t$, these merge and become few quite quickly.
However, when they are very few, it is hard for them to disappear completely.
Suppose that there is one large unmatched entry in $Y^t$ and two unmatched
entries in $Z^t$. When the two unmatched entries in $Z^t$ are merged, the single
unmatched entry in $Y^t$ is likely to be split. Thus, the situation does not
improve so quickly. There is a parity phenomenon here: if the number of
positive entries in $Y^t$ is finite, then its parity either stays the same as that
of $t$, or is opposite to that of $t$. Even if the number of positive entries is
infinite, if it takes a long time for the smaller entries to be hit, the larger
entries appear to follow this parity periodicity. One way to handle the parity
issue would be to introduce a delay to either $Y^t$ or $Z^t$, but not both, in
order to match up their parities. However, another phenomenon will be
used instead. An unmatched entry in $Y^t$ often splits into one matched entry
and one unmatched entry. With any luck, the unmatched entry might be
rather small. Thus, large unmatched entries are replaced by small unmatched
entries. In effect, there is a diffusion of unmatched entries between different
scales. Because of this, it is eventually unlikely to find a large unmatched
entry, which is what we want to prove. However, this latter process is much
slower than the first stage where large unmatched entries merge and become
fewer. Thus, in time $t$ the largest unmatched entry one can expect to find is
of order roughly $1/\log t$.

Define 

$$N_\epsilon(Y, Z) := \big|\{ i \in \mathbb{N}_+ \setminus I(Y, Z) : Y_i > \epsilon \}\big| .$$

This is the number of entries in Y that are not matched by entries in Z and 
have size larger than e. 

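Lemma 3.1. Let $\epsilon > 0$, and let $Y^0, Z^0 \in \Omega$. Let $(Y^t, Z^t)$ be the Markov
chain given by $\hat{M}$ starting at $(Y^0, Z^0)$. To abbreviate notations, set
$N^t := N_\epsilon(Y^t, Z^t) + N_\epsilon(Z^t, Y^t)$, $Q^t = Q(Y^t, Z^t) = Q(Z^t, Y^t)$, $I^t := I(Y^t, Z^t)$ and
$J^t := I(Z^t, Y^t)$. Also define

$$\bar\epsilon := \epsilon + \sum \{ Y_i^0 : Y_i^0 < \epsilon \} + \sum \{ Z_i^0 : Z_i^0 < \epsilon \} .$$

Let $y_1^t := \max\{ Y_i^t : i \notin I^t \}$ be the size of the largest unmatched entry of $Y^t$
(set $y_1^t = 0$ if all entries are matched), and let $z_1^t$ be the size of the largest
unmatched entry of $Z^t$. Let $q$ be a random variable with values in $\mathbb{N}$ which
is independent from the evolution of the chain $(Y^t, Z^t)$. Set

$$\eta := \max \{ P[q = t] : t \in \mathbb{N} \} .$$

Then

$$E\Big[ (1 - Q^q)\big(1 - Q^q - \max\{y_1^q, z_1^q\}\big) \Big] \le \frac{\eta}{2}\, N^0 + 4\, \bar\epsilon\, E[q + 1] . \tag{3.1}$$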



When the right hand side in (3.1) is small, we know that with high
probability either the sum of the unmatched entries in $Y^q$ is only slightly larger
than the largest unmatched entry, or this is true for $Z^q$.

Proof. Let $A_s$ be the event that up to time $s$ in every merging occurring
both merged pieces are of size at least $\epsilon$ and in every splitting both resulting
pieces are of size at least $\epsilon$. Let $\mathcal{F}_s$ be the $\sigma$-field generated by
$((Y^t, Z^t) : t = 0, 1, \dots, s)$. Conditioned on $\mathcal{F}_{t-1}$, the probability that at time $t$ there is
a split in any $Y_i^{t-1}$ and one of the pieces is of size less than $\epsilon$ is at most $2\epsilon$.
Conditioned on $\mathcal{F}_{t-1}$, the probability that there is any $Y_i^{t-1}$ with $Y_i^{t-1} < \epsilon$
that is merged at time $t$ with some other $Y_j^{t-1}$ is at most
$2 \sum \{ Y_i^{t-1} : Y_i^{t-1} < \epsilon \}$. Similar considerations apply to $Z^t$. Consequently, for $t \in \mathbb{N}_+$,

$$P\big[\neg A_t \mid A_{t-1}, \mathcal{F}_{t-1}\big] \le 4\, \bar\epsilon . \tag{3.2}$$
We now study the evolution of the quantity $N^t$, and consider several
different cases for the transition from $(Y^t, Z^t)$ to $(Y^{t+1}, Z^{t+1})$. In each case
we assume that $A_{t+1}$ holds.

1. The transition involves splitting in $Y^t$ and merging in $Z^t$. Suppose that
$Y_i^t$ is split and $Z_j^t$ is merged with $Z_{j'}^t$. Then necessarily $i \notin I^t$ and
$j, j' \notin J^t$. Since $A_{t+1}$ is assumed to hold, it follows that
$N_\epsilon(Y^{t+1}, Z^{t+1}) \le N_\epsilon(Y^t, Z^t) + 1$ and $N_\epsilon(Z^{t+1}, Y^{t+1}) \le N_\epsilon(Z^t, Y^t) - 1$. Thus, in this case,
$N^{t+1} \le N^t$.

2. The transition involves splitting in $Z^t$ and merging in $Y^t$. By symmetry,
also in this case we have $N^{t+1} \le N^t$.

3. The transition involves splitting in $Y^t$ and splitting in $Z^t$. Note that
by construction the size of one of the newly created split entries is the
same for $Y$ as for $Z$. Suppose that $Y_i^t$ and $Z_j^t$ are split. If $i \in I^t$ then
also $j \in J^t$ and $Y_i^t = Z_j^t$. In that case, both new entries for $Y$ are
the same as the new entries for $Z$, and hence $N^{t+1} = N^t$. The same
conclusion is obtained if $j \in J^t$. If $i \notin I^t$ and $j \notin J^t$, then in both $Y^t$
and $Z^t$ an unmatched entry is replaced by two entries at least one of
which is matched. Thus $N^{t+1} \le N^t$.

4. The transition involves merging in $Z^t$ and merging in $Y^t$. Suppose
that $Y_i^t$ is merged with $Y_{i'}^t$. It is easy to verify, as above, that in this
case also $N^{t+1} \le N^t$. However, if $i, i' \notin I^t$, then the corresponding
statement is also true for the merged entries in $Z^t$, and we actually
have $N^{t+1} \le N^t - 2$.

In summary, we see that on the event $A_{t+1}$ we have $N^{t+1} \le N^t$, and
$N^{t+1} \le N^t - 2$ when there is merging in both $Y^t$ and $Z^t$ and the merging does not
involve matched entries.

Since iV* > 0, we obviously have 

oo 

^(iy*-iv t+1 )U +1 <iv , 

t=0 

and we have seen that all the summands are nonnegative. Since q is inde- 
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pendent from (JV* — N t+1 )l At+1 , 

E[(N« - N q+1 ) l Aq+1 ] = £e[(JV* - iV< +1 ) 1 ?= J 

= £e[(jv* - iv m ) U+J P[g = t] < v n° . (3 ' 3) 

t 

Set $a^t = 1 - Q^t - \max\{y_1^t, z_1^t\}$. Recall the random variables $u$ and $v$ used
in the transition kernel $M$. If in the transition from $(Y^t, Z^t)$ to $(Y^{t+1}, Z^{t+1})$
we have $u < 1 - Q^t$ and $\max\{y_1^t, z_1^t\} < v < 1 - Q^t$, then in both $Y^t$ and $Z^t$
we have merging of unmatched entries. Thus,

\[
  P\bigl[\, N^t - N^{t+1} \ge 2 \ \text{or}\ \neg A_{t+1} \,\bigm|\, Y^t, Z^t \bigr] \ge (1 - Q^t)\, a^t .
\]



By applying this at time $t = q$ and taking expectations, we get

\[
  E\bigl[(1 - Q^q)\, a^q\bigr]
  \le P\bigl[N^q - N^{q+1} \ge 2 \ \text{or}\ \neg A_{q+1}\bigr]
  \le \tfrac12\, E\bigl[(N^q - N^{q+1})\, 1_{A_{q+1}}\bigr] + P[\neg A_{q+1}].
\]

Consequently, (3.2) and (3.3) complete the proof of the lemma. □

Assuming that we can make the right hand side of (3.1) small, Lemma 3.1
tells us that with high probability either $1 - Q^q - y_1^q$ or $1 - Q^q - z_1^q$ is small. If
we knew that both are small, it would follow that also $y_1^q - z_1^q$ is rather small,
since $\sum_j Y_j^q = \sum_j Z_j^q = 1$. However, it might be the case that $1 - Q^q - z_1^q$
is small but $1 - Q^q - y_1^q$ is not. The next lemma tells us that in such a
situation, with high probability, $Y^q$ does not have more than two significant
unmatched entries.

Lemma 3.2. With the setting and notations of Lemma 3.1, let $y_2^t$ be the
second largest unmatched entry in $Y^t$. For every $\rho \in (0, 1)$

\[
  P\bigl[1 - Q^q - y_1^q - y_2^q > \rho\bigr]
  \le 2^6 \rho^{-4}\, \eta N^0 + 2^9\, \bar\epsilon\, \rho^{-4}\, E[q + 2]. \tag{3.4}
\]

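Proof. Let $\mathcal{D}$ be the event $\{1 - Q^q - y_1^q - y_2^q > \rho\}$ and let $\mathcal{R}$ be the event
$\{1 - Q^q - z_1^q < \rho/4\}$. Assume that $\mathcal{D} \cap \mathcal{R}$ holds. Then $z_1^q > 3\rho/4 + y_1^q + y_2^q$. Let
$\mathcal{U}$ be the event that the random variables $u$ and $v$ used in the transition from
$(Y^q, Z^q)$ to $(Y^{q+1}, Z^{q+1})$ satisfy $u < 3\rho/4$ and $z_1^q - \rho/2 < v < z_1^q - \rho/4$. On
$\mathcal{D} \cap \mathcal{R} \cap \mathcal{U}$, the largest unmatched entry in $Z^q$ will be split and the transition
from $Y^q$ to $Y^{q+1}$ would involve a merge (of unmatched entries), because
$z_1^q - \rho/2 > y_1^q$. Consequently, a.s. on $\mathcal{D} \cap \mathcal{R} \cap \mathcal{U}$ the two new entries of $Z^{q+1}$
will be unmatched in $Y^{q+1}$, and in particular, $1 - Q^{q+1} \ge z_1^q$. Moreover, each
of the new entries of $Z^{q+1}$ would be larger than $\rho/4$. Clearly, $y_1^{q+1} \le y_1^q + y_2^q$.
Consequently, $1 - Q^{q+1} - y_1^{q+1} \ge z_1^q - y_1^q - y_2^q > 3\rho/4$ and $1 - Q^{q+1} - z_1^{q+1} \ge \rho/4$.
Thus, on $\mathcal{D} \cap \mathcal{R} \cap \mathcal{U}$ we have $1 - Q^{q+1} - \max\{y_1^{q+1}, z_1^{q+1}\} \ge \rho/4$. Now,

\[
\begin{aligned}
  E\bigl[(1 - Q^{q+1})\,(1 - Q^{q+1} - \max\{y_1^{q+1}, z_1^{q+1}\})\bigr]
  &\ge E\bigl[(1 - Q^{q+1})\,(1 - Q^{q+1} - \max\{y_1^{q+1}, z_1^{q+1}\}) \bigm| \mathcal{D}, \mathcal{R}, \mathcal{U}\bigr]\, P[\mathcal{D}, \mathcal{R}, \mathcal{U}] \\
  &\ge (\rho^2/16)\, P[\mathcal{D}, \mathcal{R}, \mathcal{U}].
\end{aligned}
\]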

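Lemma 3.1 with $q$ replaced by $q + 1$ therefore gives

\[
  P[\mathcal{D}, \mathcal{R}, \mathcal{U}] \le 8 \rho^{-2}\, \eta N^0 + 2^6\, \bar\epsilon\, \rho^{-2}\, E[q + 2].
\]

Clearly, $P[\mathcal{U} \mid \mathcal{D}, \mathcal{R}] = 3\rho^2/16$, and hence

\[
  P[\mathcal{D}, \mathcal{R}] \le (16/3)\, \rho^{-2}\, P[\mathcal{D}, \mathcal{R}, \mathcal{U}].
\]

On the other hand, on $\mathcal{D} \setminus \mathcal{R}$ we have $(1 - Q^q)\,(1 - Q^q - \max\{y_1^q, z_1^q\}) \ge \rho^2/4$.
Thus, applying Lemma 3.1 again gives

\[
  P[\mathcal{D} \setminus \mathcal{R}] \le 2 \rho^{-2}\, \eta N^0 + 16\, \bar\epsilon\, \rho^{-2}\, E[q + 1].
\]

Since $P[\mathcal{D}] = P[\mathcal{D}, \mathcal{R}] + P[\mathcal{D} \setminus \mathcal{R}]$, the above estimates combine to give (3.4),
and complete the proof. □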
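For readers who wish to check the constants in (3.4), the combination of the three estimates in the proof can be written out as follows. This is only a bookkeeping sketch: $\mathcal{D}$, $\mathcal{R}$, $\mathcal{U}$ denote the three events of the proof, and the steps use nothing beyond $\rho < 1$ (so that $\rho^{-2} \le \rho^{-4}$) and $E[q+1] \le E[q+2]$.

```latex
\begin{align*}
P[\mathcal{D}]
 &= P[\mathcal{D},\mathcal{R}] + P[\mathcal{D}\setminus\mathcal{R}] \\
 &\le \tfrac{16}{3}\,\rho^{-2}\bigl(8\rho^{-2}\,\eta N^0 + 2^6\,\bar\epsilon\,\rho^{-2}\,E[q+2]\bigr)
      + 2\rho^{-2}\,\eta N^0 + 16\,\bar\epsilon\,\rho^{-2}\,E[q+1] \\
 &\le \bigl(\tfrac{128}{3} + 2\bigr)\rho^{-4}\,\eta N^0
      + \bigl(\tfrac{1024}{3} + 16\bigr)\bar\epsilon\,\rho^{-4}\,E[q+2] \\
 &\le 2^6\,\rho^{-4}\,\eta N^0 + 2^9\,\bar\epsilon\,\rho^{-4}\,E[q+2].
\end{align*}
```

The last step holds because $128/3 + 2 < 64$ and $1024/3 + 16 < 512$.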

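Lemma 3.3. With the setting and notations of Lemma 3.1, let $\rho \in (0, 1/8)$
and assume that $0 < \bar\epsilon < \rho$. Then for each $t \in \mathbb{N}_+$ and for every $n \in \mathbb{N}_+$
satisfying $2^n < t\rho$

\[
  t^{-1} \sum_{\tau=0}^{t-1} P[y_1^\tau > \rho]
  \le O(\rho^{-1} n^{-1}) + O(2^{4n} \rho^{-5})\,(N^0/t + \bar\epsilon\, t). \tag{3.5}
\]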

The basic idea of the proof of the lemma is to use the fact that conditioned
on $y_1^\tau > \rho$ there is a significant enough probability that at a later time $\sigma$ there
will be some unmatched $Y_i^\sigma \in [2^{-k}\rho, 2^{-k+1}\rho]$, since the unmatched piece at
time $\tau$ of size $> \rho$ may be split immediately. Lemma 3.2 is then used to show
that when we fix $\sigma$, with high probability the latter event occurs for at most
three different $k$ in the range $\{1, \dots, n\}$, if $n$ is not too large. An appropriate
summation over $k$ and $\sigma$ completes the proof.

Proof. For $\sigma > \tau$, $\sigma, \tau \in \mathbb{N}$, $k \in \mathbb{N}_+$, let $X(\tau, \sigma, k)$ be the event that the
transition between time $\tau$ and $\tau + 1$ produces a splitting in $Y^\tau$ and one of
the split pieces is unmatched, has size in the range $[2^{-k-1}\rho, 2^{-k}\rho)$, and this
split piece is not modified up to time $\sigma$. Set

\[
  X'(\tau, \sigma, k) := X(\tau, \sigma, k) \setminus \bigcup_{\tau' = \tau+1}^{\sigma-1} X(\tau', \sigma, k).
\]

Suppose that $y_1^\tau > \rho$ and that $Y_i^\tau = y_1^\tau$. If in the transition from $\tau$ to $\tau + 1$
we have $u \in Y_i^\tau$ and $v \in (Y_i^\tau - 2^{-k}\rho,\ Y_i^\tau - 2^{-k-1}\rho)$, then $Y_i^\tau$ is indeed split,
and it is easy to see that the resulting piece of size $Y_i^\tau - v$ is unmatched a.s.
If that happens, the conditioned probability that up to time $\sigma$ this piece is
modified is bounded by $2(\sigma - \tau)\, 2^{-k}\rho$, since the size of this piece is at most
$2^{-k}\rho$. This gives

\[
  P\bigl[X(\tau, \sigma, k) \bigm| y_1^\tau > \rho\bigr] \ge 2^{-k-1}\rho^2\, \bigl(1 - 2(\sigma - \tau)\, 2^{-k}\rho\bigr).
\]

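On the other hand, the conditional probability for $X(\tau', \sigma, k)$ given the
configuration at time $\tau'$ is clearly at most $2^{-k}\rho$. Hence

\[
\begin{aligned}
  P\bigl[X'(\tau, \sigma, k) \bigm| y_1^\tau > \rho\bigr]
  &\ge P\bigl[X(\tau, \sigma, k) \bigm| y_1^\tau > \rho\bigr]\, \bigl(1 - (\sigma - \tau)\, 2^{-k}\rho\bigr) \\
  &\ge 2^{-k-1}\rho^2\, \bigl(1 - 2(\sigma - \tau)\, 2^{-k}\rho\bigr)\, \bigl(1 - (\sigma - \tau)\, 2^{-k}\rho\bigr) \\
  &\ge 2^{-k-1}\rho^2\, \bigl(1 - 3(\sigma - \tau)\, 2^{-k}\rho\bigr),
\end{aligned}
\]

which implies

\[
  P[y_1^\tau > \rho] \le 2^{k+3} \rho^{-2}\, P[X'(\tau, \sigma, k)], \quad \text{if } \tau < \sigma \le \tau + 2^{k-2}/\rho. \tag{3.6}
\]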

Let $V_\sigma(n)$ be the event that there are at least 3 distinct $k \in \{0, 1, \dots, n-1\}$
such that there is an unmatched $Y_i^\sigma$ in the range $[2^{-k-1}\rho, 2^{-k}\rho)$. We now
apply Lemma 3.2 with $q$ chosen uniformly in $\{0, 1, \dots, 2t - 1\}$ and with $\rho$
replaced by $2^{-n}\rho$ to get

\[
  \sum_{\sigma=0}^{2t-1} P[V_\sigma(n)] \le 2^{4n+10} \rho^{-4}\, \bigl(N^0 + \bar\epsilon\, (t + 2)^2\bigr). \tag{3.7}
\]
Now, observe that

\[
  \sum_{\tau=0}^{\sigma-1} \sum_{k=0}^{n-1} 1_{X'(\tau, \sigma, k)} \le 3 + 1_{V_\sigma(n)}\, n,
\]

since $X'(\tau, \sigma, k)$ can hold for at most one $\tau$. Therefore, by taking expectations
and applying (3.7) we get

\[
  \sum_{\sigma=0}^{2t-1} \sum_{\tau=0}^{\sigma-1} \sum_{k=0}^{n-1} P[X'(\tau, \sigma, k)]
  \le 6t + n \sum_{\sigma=0}^{2t-1} P[V_\sigma(n)]
  \le O(t) + O(1)\, 2^{4n}\, n\, \rho^{-4}\, (N^0 + \bar\epsilon\, t^2).
\]

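We now assume that $2^n < t\rho$. Then the inequalities (3.6) may be applied to
the above, giving

\[
  \sum_{k=0}^{n-1} \sum_{\tau=0}^{t-1} \sum_{\sigma=\tau+1}^{\tau + \lfloor 2^{k-2}/\rho \rfloor} 2^{-k-3} \rho^2\, P[y_1^\tau > \rho]
  \le O(t) + O(1)\, 2^{4n}\, n\, \rho^{-4}\, (N^0 + \bar\epsilon\, t^2).
\]

This implies (3.5), and completes the proof. □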

Corollary 3.4. Let $\gamma \in (0, 1/2)$. Let $q$ be a random variable with values
in $\mathbb{N}$ which is independent from the Markov chain $(Y^t, Z^t)$. Set
$\eta := \max\{P[q = t] : t \in \mathbb{N}\}$, and suppose that $(\bar\epsilon)^{1-\gamma} \le \eta \le (\bar\epsilon)^{\gamma} / \max\{N^0, 1\}$.
Then for all $A \ge 1$ and $\rho > 0$

\[
  P[y_1^q > \rho] \le P[q > A \eta^{-1}] + C\, (A/\rho)\, |\log \bar\epsilon|^{-1},
\]

where $C$ is a constant depending only on $\gamma$.
Proof. Let $s := \lfloor A \eta^{-1} \rfloor$. We have

\[
\begin{aligned}
  P[y_1^q > \rho]
  &\le P[q > A \eta^{-1}] + \sum_{\tau=0}^{s} P[q = \tau]\, P[y_1^\tau > \rho] \\
  &\le P[q > A \eta^{-1}] + \eta \sum_{\tau=0}^{s} P[y_1^\tau > \rho].
\end{aligned}
\]

Thus, the proof is completed by applying (3.5) with $t = s + 1$ and
$n := \lfloor |\log \bar\epsilon| / C \rfloor$, provided that with sufficiently large $C$ we have

\[
  2^{4n} \rho^{-5}\, (N^0/t + \bar\epsilon\, t) \le C\, \rho^{-1}\, |\log \bar\epsilon|^{-1} \tag{3.8}
\]

and $2^n < t\rho$. First, note that we may assume that $A, \rho^{-1} < |\log \bar\epsilon|$ and
$\bar\epsilon < 1/10$. Then $2^n < t\rho$ holds by the assumptions on $\eta$. It is also easy to
verify that with an appropriate choice of $C$ (3.8) follows from our inequalities
for $\eta$ and assumptions about $\rho$ and $A$. □

Theorem 3.5. Let $Y^0$ and $Z^0$ be independent random samples from PD(1),
and let $(Y^t, Z^t)$ denote their evolution under $M$. Let $t_0 \in \mathbb{N}_+$, and let
$q \in \{0, 1, \dots, t_0 - 1\}$ be chosen uniformly and independently from the evolution
of the chain $(Y^t, Z^t)$. Then for each $\rho > 0$,

\[
  P\bigl[\max\{y_1^q, z_1^q\} > \rho\bigr] \le O(1)\, \rho^{-1}\, (\log t_0)^{-1}.
\]

Proof. Set $\epsilon := (t_0)^{-2}$, and define $\bar\epsilon$ as in Lemma 3.1. Recall that a size
biased sample from the PD(1) sample $Y^0$ gives the uniform distribution
on $[0, 1]$ (this is well-known, but also easy to verify from the definition).
Consequently, $E[\bar\epsilon] = 3\epsilon$. Let $A_1$ be the event that $\bar\epsilon < \epsilon^{3/4}$. Then
$P[\neg A_1] \le 3\epsilon^{1/4}$. Let $A_2$ be the event that $N^0 < \epsilon^{-1/4}$. It is easy to see (e.g., using
the description of PD(1) from the introduction) that $P[\neg A_2] \le O(\epsilon)$. (In
fact, $N^0 / |\log \epsilon|$ is very unlikely to be large.) Define $\eta$ as in Corollary 3.4.
Then $\eta = \epsilon^{1/2}$ and on $A_1 \cap A_2$ we have $(\bar\epsilon)^{1-\gamma} \le \eta \le (\bar\epsilon)^{\gamma} / \max\{N^0, 1\}$ with
$\gamma = 1/5$, for example. On the event $A_1 \cap A_2$, apply the corollary with $A = 1$
and the corresponding statement with the roles of $Y$ and $Z$ switched, to get

\[
  P\bigl[\max\{y_1^q, z_1^q\} > \rho \bigm| A_1 \cap A_2\bigr] \le O(\rho^{-1})\, |\log \epsilon|^{-1}.
\]

Now our estimates for $P[\neg A_1]$ and $P[\neg A_2]$ complete the proof. □

Proof of Theorem 1.2. The proof is similar to the proof of Theorem 3.5.
Let $\mu$ be a measure that is invariant under $M$, and let $Y^0$ be a sample from
$\mu$. Let $Z^0$ be a sample from PD(1) (which we may take to be independent
from $Y^0$, though this is not important). Let $t_0 \in \mathbb{N}_+$, $t_0 > 5$, and let $q$ be as
in Theorem 3.5. As in the proof of that theorem, choose $\epsilon = t_0^{-2}$.

Note that for every $t \in \mathbb{N}$, $Y^t$ is also a sample from $\mu$, because $\mu$ is
invariant. The same also holds for $Y^q$, since $q$ is independent from the chain
$(Y^t)$.

We now explain how to get bounds on the distributions of $\bar\epsilon$ and $N^0$ using
continuous analogs of Lemmas 2.3 and 2.4. Let $\beta(s, Y) := \sum \{Y_i : Y_i \le s\}$.
Since $\lim_{s \searrow 0} \beta(s, Y^0) = 0$ a.s., we may choose $k = k(\epsilon) > 0$ sufficiently large
so that $P[\beta(2^{-k}, Y^0) > \epsilon] < \epsilon$ and $2^{-k} < \epsilon(1 - \epsilon)/8$. Set $\delta = 1 - 2^{-k}$ and
$t_1 = \lceil 2^6\, \delta^{-1}\, k\, 2^k \rceil$. The proof of Lemma 2.3 applied to the continuous setting
gives



\[
  E\bigl[\beta(\epsilon, Y^{t_1}) \bigm| \beta(2^{-k}, Y^0) < \epsilon\bigr] \le O(\epsilon)\, |\log \epsilon| .
\]



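By our choice of $k$ this implies $E[\beta(\epsilon, Y^{t_1})] \le O(\epsilon)\, |\log \epsilon|$. Since $Y^{t_1}$ and $Y^0$
have the same distribution, this gives

\[
  E[\beta(\epsilon, Y^0)] \le O(\epsilon)\, |\log \epsilon|, \tag{3.9}
\]

and since $E[\beta(\epsilon, Z^0)] = \epsilon$, as in the proof of Theorem 3.5 we conclude that
$E[\bar\epsilon] \le O(\epsilon)\, |\log \epsilon|$.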

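We now adapt the latter part of the proof of Lemma 2.4. Let
$m_0 := \lceil |\log_2 \epsilon| \rceil$. Every $Y_i > \epsilon$ lies in $(2^{-m-1}, 2^{-m}]$ for some
$m \in \{0, \dots, m_0\}$, and for that $m$ we have $2^m Y_i > 1/2$. Hence, on the one hand,

\[
  \sum_{m=0}^{m_0} 2^m\, \beta(2^{-m}, Y^0)
  = \sum_{m=0}^{m_0} \sum_i 2^m\, Y_i\, 1_{\{Y_i \le 2^{-m}\}}
  \ge \tfrac12\, \bigl|\{i \in \mathbb{N}_+ : Y_i > \epsilon\}\bigr| .
\]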



On the other hand, (3.9) gives

\[
  E\Bigl[\sum_{m=0}^{m_0} 2^m\, \beta(2^{-m}, Y^0)\Bigr] \le O(1) \sum_{m=0}^{m_0} m = O(1)\, |\log \epsilon|^2 .
\]



Consequently, we have $E[N^0] = O(1)\, |\log \epsilon|^2$. Now, the proof of Theorem 3.5
applies, and gives for all $\rho > 0$

\[
  P\bigl[\max\{y_1^q, z_1^q\} > \rho\bigr] \le O(1)\, \rho^{-1}\, (\log t_0)^{-1} .
\]

We conclude that for every $\rho > 0$ there is a coupling of $Y^0$ and $Z^0$ so that
$P[\max\{y_1^0, z_1^0\} > \rho] < \rho$. This implies that $\mu = \mathrm{PD}(1)$. □
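For orientation, one step of a single uniform split-merge chain can be sketched in code. This is a minimal illustration only, assuming the standard description of the transformation (a size-biased choice of a piece via a uniform point u, then a second uniform point v that either splits that piece or merges it with the piece containing v). The function name `split_merge_step` is hypothetical, and the coupled kernel $M$ of Section 3, with its matching and its particular ordering of entries, is not reproduced here.

```python
import random

def split_merge_step(parts, rng):
    """Apply one uniform split-merge step to a list of positive sizes summing to 1."""
    # A uniform point u selects a piece with probability equal to its size.
    u = rng.random()
    acc, i = 0.0, len(parts) - 1
    for idx, size in enumerate(parts):
        acc += size
        if u < acc:
            i = idx
            break
    chosen = parts[i]
    rest = parts[:i] + parts[i + 1:]
    # Lay the chosen piece on [0, chosen) and the others after it; draw v.
    v = rng.random()
    if v < chosen or not rest:
        # v lands in the chosen piece: split it at v (drop a zero-size piece).
        cut = min(v, chosen)
        new = rest + [x for x in (cut, chosen - cut) if x > 0.0]
    else:
        # v lands in another piece: merge that piece with the chosen one.
        acc, j = chosen, len(rest) - 1
        for idx, size in enumerate(rest):
            acc += size
            if v < acc:
                j = idx
                break
        new = rest[:j] + rest[j + 1:] + [chosen + rest[j]]
    return sorted(new, reverse=True)
```

Iterating this step from any starting partition, for instance `[1.0]` with `random.Random(0)`, gives a trajectory whose long-run behaviour can be compared empirically with PD(1); the total mass is preserved at every step up to floating-point error.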



4 Conclusion 

Proof of Theorem 1.1. The proof is similar to the proof of Theorem 3.5.
Let $\epsilon > 0$. Let $q$ be uniformly chosen in $(2\mathbb{Z}) \cap [0, \epsilon^{-1/2}]$. Set $z = z(2t/n)$, as
in (1.1). Let $Z^0$ be chosen according to PD(1), and let $Y^\tau = \mathfrak{X}(\pi_{t+\tau})/(n z)$.
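The discrete object being rescaled here, the list of cycle sizes of a composition of uniform random transpositions, is easy to generate directly. The following is a small sketch under stated assumptions: the function name is hypothetical, each transposition is applied by right multiplication (which gives the same law for the cycle sizes, since the transpositions are i.i.d.), and the normalization by $n z$ is omitted because the function $z$ of (1.1) is not reproduced in this section.

```python
import random

def cycle_sizes_after_transpositions(n, t, rng):
    """Sorted cycle sizes of a product of t uniform random transpositions of {0,...,n-1}."""
    perm = list(range(n))  # perm[v] is the image of v; start from the identity
    for _ in range(t):
        a, b = rng.sample(range(n), 2)  # two distinct points: a uniform transposition
        # Right-multiply by (a b): the images of a and b are exchanged.
        perm[a], perm[b] = perm[b], perm[a]
    seen = [False] * n
    sizes = []
    for start in range(n):
        if seen[start]:
            continue
        length, v = 0, start
        while not seen[v]:  # walk around the cycle containing `start`
            seen[v] = True
            v = perm[v]
            length += 1
        sizes.append(length)
    return sorted(sizes, reverse=True)
```

For example, `cycle_sizes_after_transpositions(10000, 10000, random.Random(1))` corresponds to $t = cn$ with $c = 1$; dividing the leading entries by `n` gives a rough picture of the large cycles discussed in the theorem.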

We now apply a coupling of $Z^\tau$ and $Y^\tau$ similar to the coupling $M$ given
in Section 3. There are a few minor necessary modifications in the definition
of the coupling. First, note that the entries of $Y^\tau$ do not sum to 1 but to
$1/z$. Thus, the random variables $u$ and $v$ needed in the transition kernel
for $M$ should be uniform in $[0, 1/z]$. In the arrangement of the entries of $Y^\tau$
we put those segments corresponding to cycles that do not intersect $V_G^t$ at the
very end; that is, roughly in the interval $[1, 1/z)$. When $u$ or $v$ turn out to be
outside of $[0, 1]$, we make no transition in $Z$; that is, $Z^{\tau+1} = Z^\tau$, in this case.

Another modification is necessary because the transitions of $Y^\tau$ are discrete.
Thus, the actual size of the splits occurring in the transitions of $Y$
would be determined with $\lceil n z v \rceil / (n z)$. The definition of the matching
between entries in $Y^\tau$ and entries in $Z^\tau$ needs to be modified as well. When a
split is made in both $Y^\tau$ and $Z^\tau$, pieces which would have been exactly the
same may differ slightly now, because of the discretization in the transition
of $Y$. This difference is of order $1/n$, and may be safely ignored. Although
these errors may accumulate over time, when matched pieces are merged and
split, the total discrepancy would still be small, since we take $n$ much larger
than $\epsilon^{-1/2}$, which bounds $q$.

Lemma 2.4 gives us good control on $N^0$ while (2.4) gives a bound on the
probability that $\bar\epsilon$ is large. Consequently, the proof of Theorem 3.5 shows
that for all $\rho > 0$, if $n$ is large,

\[
  P\bigl[\|Y^q - Z^q\|_\infty > \rho\bigr] \le O(1)\, \rho^{-1}\, |\log \epsilon|^{-1}. \tag{4.1}
\]

Thus, the statement of Theorem 1.1 is obtained with $t$ replaced by $t + q$. If
we consider $\epsilon$ and $\rho$ as fixed, then $q$ is bounded. Since $q$ is even, the following
lemma completes the proof. □

Lemma 4.1. Let $t \ge cn$, $c > 1/2$. As $n \to \infty$, the total variation distance
between the law of $\mathfrak{X}(\pi_t)$ and the law of $\mathfrak{X}(\pi_{t+2})$ tends to zero.

First, we give a slightly informal proof. Note that when the largest entry
in $\mathfrak{X}(\pi_\tau)$ is not too small, there is probability bounded away from 0 and 1
that $\mathfrak{X}(\pi_\tau) = \mathfrak{X}(\pi_{\tau+2})$, because that entry may split and then recombine. We
know that for many $\tau \in [n/2, t]$ the largest entry is not small. Consequently,
there is a random "delay", which implies the statement of the lemma. For
readers who are not convinced yet, we offer a proof with more details.

Proof. Set $W^\tau = \mathfrak{X}(\pi_\tau)$. To prove that $W^t$ and $W^{t+2}$ have close
distributions, we couple the chain $(W^\tau)$ with a chain $(U^\tau)$ which has the same
distribution as $(W^\tau)$. In essence, the two chains will be the same; the
significant difference involves a random shift in time. Set $\tau_0 = \tau'_0 = 0$.
Inductively, suppose that $\tau_i$ and $\tau'_i$ have been defined such that $W^{\tau_i} = U^{\tau'_i}$.
Let $m = m_i$ be the largest integer such that $W^{\tau_i + 2j} = W^{\tau_i}$ for all
$j = 1, 2, \dots, m$. The distribution of $m$ conditioned on $W^{\tau_i}$ is geometric; that is,
$P[m = k \mid W^{\tau_i}] = (1 - p)\, p^k$, where $p = p_i = P[m > 0 \mid W^{\tau_i}]$. Similarly, the
largest integer $m'$ such that $U^{\tau'_i + 2j} = U^{\tau'_i}$ for all $j = 1, 2, \dots, m'$ has the same
conditioned distribution: $P[m' = k \mid W^{\tau_i}] = P[m' = k \mid U^{\tau'_i}] = (1 - p)\, p^k$.
We now couple $m$ and $m'$. If $\tau'_i = \tau_i + 2$, take $m' = m$. Otherwise we
couple $m$ and $m'$ so that $|m - m'| \le 1$, but $m \ne m'$ happens quite
frequently. For example, for all $k \in \mathbb{N}$ take $(m, m') = (k, k+1)$ with probability
$p^{k+1}(1 - p)/(1 + p)$, $(m, m') = (k+1, k)$ with the same probability, and
$(m, m') = (0, 0)$ with probability $(1 - p)/(1 + p)$, all conditioned on $W^{\tau_i}$. In
this case, the conditioned probability that $m' - m = \pm 1$ is $2p/(1 + p)$. In
either case, take $U^{\tau'_i + j} = W^{\tau_i + j}$ for $j = 1, 2, \dots, \min\{2m, 2m'\}$. If $m' > m$
let $U^{\tau'_i + 2m + 1}$ be independent from the chain $(W^\tau)$ given $(W^{\tau_i}, m, m')$, and
similarly if $m > m'$. Clearly, $W^{\tau_i + 2m} = U^{\tau'_i + 2m'}$. Take $U^{\tau'_i + 2m' + j} = W^{\tau_i + 2m + j}$
for $j = 1, 2$, and set $\tau_{i+1} := \tau_i + 2m + 2$ and $\tau'_{i+1} := \tau'_i + 2m' + 2$. Then continue
inductively. This completes the specification of the coupling.
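The example coupling of $m$ and $m'$ can be checked mechanically: both marginals should be geometric with parameter $p$, and the total mass on $\{m \ne m'\}$ should be $2p/(1+p)$. The sketch below does this with exact rational arithmetic; the function names are hypothetical and the check is illustrative rather than part of the argument.

```python
from fractions import Fraction

def joint_prob(m, m_prime, p):
    """Probability of (m, m') under the example coupling, for a Fraction p in (0, 1)."""
    if m == 0 and m_prime == 0:
        return (1 - p) / (1 + p)
    if abs(m - m_prime) == 1:
        k = min(m, m_prime)  # the pair is (k, k+1) or (k+1, k)
        return p ** (k + 1) * (1 - p) / (1 + p)
    return Fraction(0)

def marginal_m(k, p):
    """P[m = k]: only m' in {k-1, k, k+1} can carry mass."""
    return sum(joint_prob(k, mp, p) for mp in (k - 1, k, k + 1) if mp >= 0)

def prob_differ_up_to(K, p):
    """Mass of {m != m'} restricted to pairs with min(m, m') < K."""
    return sum(2 * joint_prob(k, k + 1, p) for k in range(K))
```

With `p = Fraction(1, 3)`, `marginal_m(k, p)` equals `(1 - p) * p**k` for every `k`, and `prob_differ_up_to(K, p)` equals `2*p*(1 - p**K)/(1 + p)`, which increases to $2p/(1+p)$ as `K` grows.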

It clearly suffices to prove that with probability tending to 1 as $n \to \infty$,
we have $\tau'_i = \tau_i + 2$ with some $\tau_i < t$. First, observe that the $p_i$ are bounded
away from 1. This guarantees that a.a.s. $\tau_i = O(i)$. Now note that $p_i$ is
bounded away from zero by some positive function of $W_1^{\tau_i}/n$ (the largest
entry normalized), because that largest entry may split in the next step, and
then the same two parts may merge in the step after that. We know, for
example from (2.4), that with high probability for most values of $i$ such that
$t > \tau_i > (cn + n/2)/2$, the largest entry of $W^{\tau_i}$ is not too much smaller than
$n$. Consequently, a.a.s. we have $p_i$ bounded away from zero for many values
of $i$ satisfying $\tau_i < t$. Similarly, $m_i \ne m'_i$ for many values of $i$. Note that
$(\tau'_i - \tau_i)/2$ is a martingale, and its increments are in $\{-1, 0, 1\}$. By removing
the zero increment steps, the martingale may be coupled with a simple random
walk on $\mathbb{Z}$. The martingale starts at 0. Thus, the probability that many $\pm 1$
steps are performed and it never gets to 1 tends to 0. This completes the
proof. □
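The last step uses the fact that a simple random walk on $\mathbb{Z}$ started at 0 reaches 1 with probability tending to 1 as the number of steps grows. A short exact computation illustrates how fast the exceptional probability decays; the function name is hypothetical and the code is a sketch, not part of the proof.

```python
from fractions import Fraction
from math import comb

def prob_never_reaches_one(steps):
    """P[a simple random walk from 0 stays <= 0 during `steps` steps], computed exactly."""
    dist = {0: Fraction(1)}  # position -> probability, restricted to paths that stayed <= 0
    for _ in range(steps):
        nxt = {}
        for pos, pr in dist.items():
            for step in (-1, 1):
                new = pos + step
                if new <= 0:  # paths that reach 1 are discarded
                    nxt[new] = nxt.get(new, Fraction(0)) + pr / 2
        dist = nxt
    return sum(dist.values(), Fraction(0))
```

By the reflection principle these values equal `comb(k, k // 2) / 2**k`, which is of order $k^{-1/2}$, so the probability that the walk makes many $\pm 1$ steps without reaching 1 indeed tends to 0.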

Acknowledgments: I have had the pleasure to benefit from conversations
with Rick Durrett, Michael Larsen, Russ Lyons, David Wilson and Ofer
Zeitouni in connection with this work. Nathanael Berestycki has kindly
pointed out an error in the proof of Lemma 3.3 in a previous version of
this paper.



References 

[Ang03] Omer Angel. Random infinite permutations and the cyclic time
random walk. In C. Banderier and C. Krattenthaler, editors,
Random Walks and Discrete Potential Theory, Discrete Mathematics
and Theoretical Computer Science, pages 9-16, 2003,
http://dmtcs.loria.fr/proceedings/html/dmAC0101.abs.html.

[AS00] Noga Alon and Joel H. Spencer. The probabilistic method.
Wiley-Interscience Series in Discrete Mathematics and Optimization.
Wiley-Interscience [John Wiley & Sons], New York, second edition,
2000. With an appendix on the life and work of Paul Erdős.

[BD] Nathanael Berestycki and Rick Durrett. A phase transition in the
random transposition random walk, arXiv:math.PR/0403259.

[DMP95] Persi Diaconis, Michael McGrath, and Jim Pitman. Riffle shuffles,
cycles, and descents. Combinatorica, 15(1):11-29, 1995.

[DMWZZ] Persi Diaconis, Eddy Mayer-Wolf, Ofer Zeitouni, and Martin
Zerner. The Poisson-Dirichlet law is the unique invariant
distribution for uniform split-merge transformations,
arXiv:math.PR/0305313.

[DS81] Persi Diaconis and Mehrdad Shahshahani. Generating a random
permutation with random transpositions. Z. Wahrsch. Verw. Gebiete,
57(2):159-179, 1981.

[Dud89] Richard M. Dudley. Real analysis and probability. Wadsworth
& Brooks/Cole Advanced Books & Software, Pacific Grove, CA,
1989.

[Hol01] Lars Holst. The Poisson-Dirichlet distribution and its relatives
revisited, 2001,
http://www.math.kth.se/matstat/fofu/reports/PoiDir.pdf.
preprint.

[JLR00] Svante Janson, Tomasz Łuczak, and Andrzej Ruciński. Random
graphs. Wiley-Interscience Series in Discrete Mathematics and
Optimization. Wiley-Interscience, New York, 2000.

[Spe94] Joel Spencer. Ten lectures on the probabilistic method, volume 64
of CBMS-NSF Regional Conference Series in Applied Mathematics.
Society for Industrial and Applied Mathematics (SIAM),
Philadelphia, PA, second edition, 1994.

[Tót93] Bálint Tóth. Improved lower bound on the thermodynamic pressure
of the spin 1/2 Heisenberg ferromagnet. Lett. Math. Phys.,
28(1):75-84, 1993.

[Wat76] G. A. Watterson. The stationary distribution of the infinitely-many
neutral alleles diffusion model. J. Appl. Probability,
13(4):639-651, 1976.





